Medpie: an Information Extraction Package for Medical Message Board Posts

Authors

  • Adrian Benton
  • John H. Holmes
  • Shawndra Hill
  • Annie Chung
  • Lyle H. Ungar
Abstract

SUMMARY: We have developed medpie, a software package for preparing medical message board corpora and extracting from them patient mentions of, and statistics for, drugs, herbs and experienced adverse effects. The package is divided into four modules: web crawling, HTML cleaning, de-identification and information extraction. It also includes a sample controlled vocabulary of drug, herb and adverse effect terms.

AVAILABILITY: http://www.cis.upenn.edu/~ungar/medpie.zip

DEPENDENCIES: Python 2.6 or 2.7.
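To make the module breakdown concrete, the following is a minimal sketch of how HTML cleaning, de-identification and vocabulary-based extraction might chain together. It is not medpie's actual code or API: every function name, the e-mail-only de-identification rule and the three-term vocabulary are assumptions made for illustration, the web-crawling stage is omitted, and the sketch targets Python 3 rather than the Python 2.6/2.7 the package itself requires.

```python
import re
from collections import Counter
from html.parser import HTMLParser

# Hypothetical toy vocabulary (term -> category); not the package's vocabulary.
VOCAB = {"tamoxifen": "drug", "ginseng": "herb", "nausea": "adverse_effect"}

# Assumed de-identification rule: mask e-mail addresses only.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


class _TextExtractor(HTMLParser):
    """Collects the text nodes of an HTML fragment and drops the tags."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def clean_html(raw):
    """Strip tags and collapse runs of whitespace into single spaces."""
    parser = _TextExtractor()
    parser.feed(raw)
    parser.close()
    return " ".join(" ".join(parser.parts).split())


def deidentify(text):
    """Replace e-mail addresses with a placeholder token."""
    return EMAIL_RE.sub("[EMAIL]", text)


def count_mentions(posts):
    """Count, for each vocabulary term, the number of posts mentioning it."""
    counts = Counter()
    for post in posts:
        tokens = set(re.findall(r"[a-z]+", post.lower()))
        for term in VOCAB:
            if term in tokens:
                counts[term] += 1
    return counts


def run_pipeline(raw_posts):
    """Clean and de-identify each post, then tally vocabulary mentions."""
    cleaned = [deidentify(clean_html(p)) for p in raw_posts]
    return cleaned, count_mentions(cleaned)
```

Calling `run_pipeline` on a list of raw HTML posts returns the cleaned, masked text together with per-term post counts. A real system would need far broader de-identification (names, locations, usernames) and multi-word term matching than this sketch attempts.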

Similar resources

Classifying Message Board Posts with an Extracted Lexicon of Patient Attributes

The goal of our research is to distinguish veterinary message board posts that describe a case involving a specific patient from posts that ask a general question. We create a text classifier that incorporates automatically generated attribute lists for veterinary patients to tackle this problem. Using a small amount of annotated data, we train an information extraction (IE) system to identify ...

Identifying potential adverse effects using the web: A new approach to medical hypothesis generation

Medical message boards are online resources where users with a particular condition exchange information, some of which they might not otherwise share with medical providers. Many of these boards contain a large number of posts and contain patient opinions and experiences that would be potentially useful to clinicians and researchers. We present an approach that is able to collect a corpus of m...

Identifying Information in Stock Message Boards and Its Implications for Stock Market Efficiency

The information value of stock message boards has often been debated. A main difficulty in assessing the value is the presence of a large number of posts with varying quality. This paper presents an intuitive approach to identify and aggregate information in stock message boards. We weigh each post’s recommendation by its author’s credibility based on accuracy of his past posts. We find that th...

Classifying Sentences as Speech Acts in Message Board Posts

This research studies the text genre of message board forums, which contain a mixture of expository sentences that present factual information and conversational sentences that include communicative acts between the writer and readers. Our goal is to create sentence classifiers that can identify whether a sentence contains a speech act, and can recognize sentences containing four different spee...

A Linguistic Analysis of the Online Debate on Vaccines and Use of Fora as Information Stations and Confirmation Niche

This study looks at the communication between users concerning health risks, with the aim of exploring their use of fora and assessing whether participants establish a niche with like-minded users during these exchanges. By integrating a corpus linguistic approach with content analysis and multiple studies on computer mediated health discourse, this study analyses the intense attention paid to ...

Journal:
  • Bioinformatics

Volume 28, Issue 5

Pages: -

Publication date: 2012